An introduction to the problem domain and a description of the variable(s) you are choosing to analyze (and why!)
Write a summary paragraph of findings that includes the 5 values calculated from your summary information R script
These will likely be calculated using your DPLYR skills, answering questions such as:
Feel free to calculate and report values that you find relevant. Again, remember that the purpose is to think about how these measures of incarceration vary by race.
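As a rough illustration of the kind of dplyr summary described above, the sketch below totals prison populations by race and year and then finds the year with the highest value. It uses a small invented table rather than the real data. The data frame name `incarceration` and the columns `black_prison_pop` and `white_prison_pop` are assumptions modeled on the `other_race_prison_pop` column, so they may need to be adjusted to match the actual dataset.

```r
library(dplyr)

# Toy stand-in for the incarceration data; every value here is invented.
# `black_prison_pop` and `white_prison_pop` are assumed column names.
incarceration <- data.frame(
  year             = c(2000, 2000, 2001, 2001),
  county_name      = c("A", "B", "A", "B"),
  black_prison_pop = c(10, 30, 20, NA),
  white_prison_pop = c(40, 60, 50, 70)
)

# Total prison population per race in each year, skipping missing values.
prison_by_year <- incarceration %>%
  group_by(year) %>%
  summarize(
    black_total = sum(black_prison_pop, na.rm = TRUE),
    white_total = sum(white_prison_pop, na.rm = TRUE)
  )

# Year with the highest Black prison population in the toy data.
peak_year <- prison_by_year %>%
  filter(black_total == max(black_total)) %>%
  pull(year)
```

Using `na.rm = TRUE` keeps missing counts from turning a whole yearly total into `NA`, but it also means a total can silently understate the true value when many counties did not report.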
Who collected the data?
How was the data collected or generated?
Why was the data collected?
How many observations (rows) are in your data?
r rows.
How many features (columns) are in the data?
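The row and column counts asked about here can be read directly from the data frame. The sketch below is a minimal example on an invented table. The data frame name `incarceration` is an assumption, and the values are made up. The same pattern also gives the share of missing values in a column, which helps when judging a column's limitations.

```r
# Toy stand-in for the incarceration data; every value here is invented.
incarceration <- data.frame(
  total_pop_15to64 = c(1000, 2500, 400, 800, 1200),
  aapi_pop_15to64  = c(50, NA, NA, 20, 35)
)

n_rows <- nrow(incarceration)  # number of observations (rows)
n_cols <- ncol(incarceration)  # number of features (columns)

# Share of missing values in one column: the mean of a logical vector
# is the proportion of TRUE entries.
aapi_missing_share <- mean(is.na(incarceration$aapi_pop_15to64))
```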
What, if any, ethical questions or questions of power do you need to consider when working with this data?
One consideration is that this data can be considered sensitive information. Many individuals have been sent to prison or jail, which can be a major event in their lives, so analysis of this data should not be taken lightly.
What are possible limitations or problems with this data? (at least 200 words)
In aapi_pop_15to64 there are about 153,811 rows. Of those, about 62,780 are missing. This means that about 41% of the data in this column is missing, which could make data analysis much more difficult.
total_pop_15to64 tells us how many people are between the ages of 15 and 64. However, it does not give us information on specific ages. It also does not give us a clear idea of the number of individuals who are older than 64 or younger than 15.
It is unclear what other_race_prison_pop means. The documentation says it covers other or unknown racial categories, but that does not give much information to work with. This also heavily limits how individuals are represented, because if they are not Asian or Pacific Islander, Black, Latinx, Native American, or white, then they have to be classified as ‘other’.
Chart:
Description and Why:
Include a chart. Make sure to describe why you included the chart, and what patterns emerged
The second chart that you will create and include will show how two different (continuous) variables are related to one another. Again, think carefully about what such a comparison means and what you want to communicate to your user (you may have to find relevant trends in the dataset first!). Here are some requirements to help guide your design:
Chart:
Description and Why:
This is a map of all of Washington’s counties in the year 2000, showing how their total prison populations compare.
This was included
Include a chart. Make sure to describe why you included the chart, and what patterns emerged